Building a Gold Standard for Thai WordNet

Authors

  • Dhanon Leenoi
  • Thepchai Supnithi
  • Wirote Aroonmanakun
Abstract

This paper presents a method for building a gold standard test set for Thai WordNet. The results of this research can be used to evaluate or compare the output of different approaches to Thai WordNet construction. In this research, a part of Thai WordNet is carefully handcrafted from the Common Base Concepts' FirstOrderEntities using five translation resources. However, we found that building a gold standard test set is not as easy as finding words that fit the definitions of synsets; cultural gaps between the languages also have to be taken into account.

Similar resources

BabelNet: Building a Very Large Multilingual Semantic Network

In this paper we present BabelNet – a very large, wide-coverage multilingual semantic network. The resource is automatically constructed by means of a methodology that integrates lexicographic and encyclopedic knowledge from WordNet and Wikipedia. In addition, Machine Translation is applied to enrich the resource with lexical information for all languages. We conduct experiments on new and ...

What implementation and translation teach us: the case of semantic similarity measures in wordnets

Wordnet::Similarity is an important instrument used for many applications. It has been available for a while as a toolkit for English and it has been frequently tested on English gold standards. In this paper, we describe how we constructed a Dutch gold standard that matches the English gold standard as closely as possible. We also re-implemented the WordNet::Similarity package to be able to de...

Semantic Similarity Measures for the Development of Thai Dialog System

Semantic similarity plays an important role in a number of applications including information extraction, information retrieval, document clustering and ontology learning. Most work has concentrated on English and other European languages. However, for the Thai language, there has been no research about word semantic similarity. This paper presents an experiment and benchmark data sets investig...

Thai WordNet Construction

This paper describes the semi-automatic construction of Thai WordNet and the method applied for Asian WordNet. Based on the Princeton WordNet (PWN), we develop a method for generating a WordNet by using an existing bilingual dictionary. We automatically align each PWN synset to a bilingual dictionary through the English equivalent and its part-of-speech (POS). Manual translation is also employed after th...

Senseval-3 task: Word Sense Disambiguation of WordNet glosses

The SENSEVAL-3 task to perform word-sense disambiguation of WordNet glosses was designed to encourage development of technology to make use of standard lexical resources. The task was based on the availability of sense-disambiguated hand-tagged glosses created in the eXtended WordNet project. The hand-tagged glosses provided a “gold standard” for judging the performance of automated disambiguati...

Publication date: 2008 (Building a Gold Standard for Thai WordNet)